Abstract 

The repeatability and efficiency of a corner detector determine how likely it is to be useful in 
a real-world application. The repeatability is important because the same scene viewed from different 
positions should yield features which correspond to the same real-world 3D locations [1]. The efficiency 
is important because this determines whether the detector combined with further processing can operate 
at frame rate. 

Three advances are described in this paper. First, we present a new heuristic for feature detection, 
and using machine learning we derive a feature detector from this which can fully process live PAL 
video using less than 5% of the available processing time. By comparison, most other detectors cannot 
even operate at frame rate (Harris detector 115%, SIFT 195%). Second, we generalize the detector, 
allowing it to be optimized for repeatability, with little loss of efficiency. Third, we carry out a rigorous 
comparison of corner detectors based on the above repeatability criterion applied to 3D scenes. We 
show that despite being principally constructed for speed, on these stringent tests, our heuristic detector 
significantly outperforms existing feature detectors. Finally, the comparison demonstrates that using 
machine learning produces significant improvements in repeatability, yielding a detector that is both 
very fast and very high quality. 

Index Terms 

Corner detection, feature detection. 

I. Introduction 

Corner detection is used as the first step of many vision tasks such as tracking, localisation, 
SLAM (simultaneous localisation and mapping), image matching and recognition. This need has 
driven the development of a large number of corner detectors. However, despite the massive 
increase in computing power since the inception of corner detectors, it is still true that when 
processing live video streams at full frame rate, existing feature detectors leave little if any time 
for further processing. 

In the applications described above, corners are typically detected and matched into a database, 
thus it is important that the same real-world points are detected repeatably from multiple views 
[1]. The amount of variation in viewpoint under which this condition should hold depends on 
the application. 

II. Previous work 

A. Corner detectors 

Here we review the literature to place our advances in context. In the literature, the terms 
"point feature", "feature", "interest point" and "corner" refer to a small point of interest with 
variation in two dimensions. Such points often arise as the result of geometric discontinuities, 
such as the corners of real world objects, but they may also arise from small patches of texture. 
Most algorithms are capable of detecting both kinds of points of interest, though the algorithms 
are often designed to detect one type or the other. A number of the detectors described below 
compute a corner response, C, and define corners to be large local maxima of C. 

1) Edge based corner detectors: An edge (usually a step change in intensity) in an image 
corresponds to the boundary between two regions. At corners, this boundary changes direction 
rapidly. 

a) Chained edge based corner detectors: Many techniques have been developed which 
involve detecting and chaining edges with a view to analysing the properties of the edge, often 
taking points of high curvature to be corners. Many early methods used chained curves, and 
since the curves are highly quantized, the techniques concentrate on methods for effectively and 
efficiently estimating the curvature. A common approach has been to use a chord for estimating 
the slope of a curve or a pair of chords to find the angle of the curve at a point. 

Early methods computed the smallest angle of the curve over chords spanning different 
numbers of links. Corners are defined as local minima of angle after local averaging [3]. 
Alternatively, corners can be defined as isolated discontinuities in the mean slope, which can be 
computed using a chord spanning a fixed set of links in the chain [4]. Averaging can be used to 
compute the slope and the length of the curve used to determine if a point is isolated [5]. The 
angle can be computed using a pair of chords with a central gap, and peaks with certain widths 
(found by looking for zero crossings of the angle) are defined as corners [6]. 

Instead of using a fixed set of chord spans, some methods compute a 'region of support' 
which depends on local curve properties. For instance local maxima of chord lengths can be 
used to define the region of support, within which a corner must have maximal curvature [7]. 
Corners can be defined as the centre of a region of support with high mean curvature, where 
the support region is large and symmetric about its centre [8]. The region free from significant 
discontinuities around the candidate point can be used with curvature being computed as the 
slope change across the region [9] or the angle to the region's endpoints [10]. 

An alternative to using chords of the curves is to apply smoothing to the points on the curve. 
Corners can be defined as points with a high rate of change of slope [11], or points where the 
curvature decreases rapidly to the nearest minima and the angle to the neighbouring maxima is 
small [12]. 

A fixed smoothing scale is not necessarily appropriate for all curves, so corners can also 
be detected at high curvature points which have stable positions under a range of smoothing 
scales [13]. As smoothing is decreased, curvature maxima bifurcate, forming a tree over scale. 
Branches of a tree which are longer (in scale) than the parent branch are considered as stable 
corner points [14]. Instead of Gaussian smoothing, extrema of the wavelet transforms of the 
slope [15] or wavelet transform modulus maximum of the angle [16], [17] over multiple scales 
can be taken to be corners. 

The smoothing scale can be chosen adaptively. The Curvature Scale Space technique [18] 
uses a scale proportional to the length and defines corners at maxima of curvature where the 
maxima are significantly larger than the closest minima. Locally adaptive smoothing using 
anisotropic diffusion [19] or smoothing scaled by the local variance of curvature [20] have 
also been proposed. 

Instead of direct smoothing, edges can be parameterised with cubic splines and corners detected 
at points of high second derivative where the spline deviates a long way from the control 
point [21], [22]. 

A different approach is to extend curves past the endpoints by following saddle minima or ridge 
maxima in the gradient image until a nearby edge is crossed, thereby finding junctions [23]. Since 
the chain code number corresponds roughly to slope, approximate curvature can be found using 
finite differences, and corners can be found by identifying specific patterns [24]. Histograms of 
the chain code numbers on either side of the candidate point can be compared using normalized 
cross correlation and corners can be found at small local minima [25]. Also, a measure of the 
slope can be computed using circularly smoothed histograms of the chain code numbers [26]. 
Points can be classified as corners using fuzzy rules applied to measures computed from the 
forward and backward arm and the curve angle [27]. 

b) Edgel based corner detectors: Chained edge techniques rely on the method used to 
perform segmentation and edge chaining, so many techniques find edge points (edgels) and 
examine the local edgels or image to find corners. 

For instance, each combination of presence or absence of edgels in a 3 × 3 window can be 
assigned a curvature, and corners found as maxima of curvature in a local window [28]. Corners 
can also be found by analysing edge properties in the window scanned along the edge [29]. A 
generalized Hough transform [30] can be used which replaces each edgel with a line segment, 
and corners can be found where lines intersect, i.e. at large maxima in Hough space [31]. In 
a manner similar to chaining, a short line segment can be fitted to the edgels, and the corner 
strength found by the change in gradient direction along the line segment [32]. Edge detectors 
often fail at junctions, so corners can be defined as points where several edges at different 
angles end nearby [33]. By finding both edges and their directions, a patch on an edge can 
be compared to patches on either side in the direction of the contour, to find points with low 
self-similarity [34]. 

Rapid changes in the edge direction can be found by measuring the derivative of the gradient 
direction along an edge and multiplying by the magnitude of the gradient: 

$$C_K = \frac{g_{xx} g_y^2 + g_{yy} g_x^2 - 2 g_{xy} g_x g_y}{g_x^2 + g_y^2}, \qquad (1)$$

where, in general, 

$$g_x = \frac{\partial g}{\partial x}, \quad g_{xx} = \frac{\partial^2 g}{\partial x^2}, \quad \text{etc.},$$

and g is either the image or a bivariate polynomial fitted locally to the image [35]. $C_K$ can also 
be multiplied by the change in edge direction along the edge [36]. 
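
As an illustration of Equation (1), the following Python sketch evaluates the response at one interior pixel. It assumes that g is the raw image stored as a list of rows and that the derivatives are approximated by central finite differences. The function name `c_k_response` and the value returned in flat regions are choices made for this example only and are not taken from [35].

```python
def c_k_response(img, x, y):
    """Evaluate (g_xx g_y^2 + g_yy g_x^2 - 2 g_xy g_x g_y) / (g_x^2 + g_y^2).

    `img` is a list of rows (img[row][col]); (x, y) must be an interior pixel.
    Derivatives are central finite differences (an assumption of this sketch).
    """
    gx = (img[y][x + 1] - img[y][x - 1]) / 2.0
    gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
    gxx = img[y][x + 1] - 2.0 * img[y][x] + img[y][x - 1]
    gyy = img[y + 1][x] - 2.0 * img[y][x] + img[y - 1][x]
    gxy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    denom = gx * gx + gy * gy
    if denom == 0.0:
        # Flat region: there is no gradient, hence no edge direction to measure.
        return 0.0
    return (gxx * gy * gy + gyy * gx * gx - 2.0 * gxy * gx * gy) / denom


# A straight intensity ramp has a constant edge direction, so the response is zero.
ramp = [[10.0 * col for col in range(5)] for _ in range(5)]
ramp_response = c_k_response(ramp, 2, 2)

# The saddle surface g = x * y bends the gradient direction, giving a non-zero response.
saddle = [[float(col * row) for col in range(5)] for row in range(5)]
saddle_response = c_k_response(saddle, 2, 2)
```

On the ramp the second derivatives vanish and the response is zero, whereas the saddle image gives a non-zero value because the gradient direction changes along the level curve.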

Corner strength can also be computed as rate of change in gradient angle when a bicubic 
polynomial is fitted to the local image surface [37], [38]: 

$$C_Z = -2 \frac{c_x^2 c_{y^2} - c_x c_y c_{xy} + c_y^2 c_{x^2}}{(c_x^2 + c_y^2)^{3/2}}, \qquad (2)$$

where, for example, $c_{xy}$ is the coefficient of xy in the fitted polynomial. If edgels are only 
detected at the steepest part of an edge, then a score computing total image curvature at the 
edgels is given by: 

$$C_W = \nabla^2 I - S |\nabla I|^2, \qquad (3)$$

where $\nabla I$ is the image gradient [39]. 

2) Greylevel derivative based detectors: The assumption that corners exist along edges is 
an inadequate model for patches of texture and point like features, and is difficult to use at 
junctions. Therefore a large number of detectors operate directly on greylevel images without 
requiring edge detection. 

One of the earliest detectors [40] defines corners to be local extrema in the determinant of 
the Hessian: 

$$C_{DET} = \left| \mathrm{H}[I] \right| = I_{xx} I_{yy} - (I_{xy})^2. \qquad (4)$$

This is frequently referred to as the DET operator. $C_{DET}$ moves along a line as the scale changes. 
To counteract this, DET extrema can be found at two scales and connected by a line. Corners are 
then taken as maxima of the Laplacian along the line [41]. 

Instead of DET maxima, corners can also be taken as the gradient maxima on a line connecting 
two nearby points of high Gaussian curvature of opposite sign where the gradient direction 
matches the sign change [42]. By considering gradients as elementary currents, the magnitude 
of the corresponding magnetic vector potential can be computed. The gradient of this is taken 
normal and orthogonal to the local contour direction, and the corner strength is the multiple of 
the magnitude of these [43]. 

a) Local SSD (Sum of Squared Differences) detectors: Features can be defined as points 
with low self-similarity in all directions. The self-similarity of an image patch can be measured 
by taking the SSD between an image patch and a shifted version of itself [44]. This is the basis for 
a large class of detectors. Harris and Stephens [45] built on this by computing an approximation 
to the second derivative of the SSD with respect to the shift. This is both computationally more 
efficient and can be made isotropic. The result is: 

$$\mathbf{H} = \begin{bmatrix} \widehat{I_x^2} & \widehat{I_x I_y} \\ \widehat{I_x I_y} & \widehat{I_y^2} \end{bmatrix}, \qquad (5)$$

where $\widehat{\;\cdot\;}$ denotes averaging performed over the area of the image patch. Because of the wording 
used in [45], it is often mistakenly claimed that H is equal to the negative second derivative of the 
autocorrelation. This is not the case because the SSD is equal to the sum of the autocorrelation 
and some additional terms [46]. 

The earlier Förstner [47] algorithm is easily explained in terms of H. For a more 
recently proposed detector [48], it has been shown [49] that under affine motion, it is 
better to use the smallest eigenvalue of H as the corner strength function. A number of other 
suggestions [45], [50], [49], [51] have been made for how to compute the corner strength from 
H, and these have been shown to all be equivalent to various matrix norms of H [52]. H can be 
generalized by generalizing the number of channels and dimensionality of the image [53] and 
it can also be shown that [47], [49], [54] are equivalent to specific choices of the measure 
used in [51]. 
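
To make the role of H concrete, the sketch below builds the averaged gradient products for one patch and evaluates the smallest eigenvalue, one of the corner strength functions mentioned above. It is a minimal illustration under stated assumptions: gradients come from central differences, the averaging is a uniform box over a square patch (practical detectors often use Gaussian weighting), and both function names are invented for this example.

```python
import math


def averaged_gradient_products(img, x, y, radius=1):
    """Return (Ix^2, Ix*Iy, Iy^2) averaged over a square patch centred on (x, y).

    `img` is a list of rows; the patch plus a one-pixel border must lie inside it.
    """
    sxx = sxy = syy = 0.0
    count = 0
    for v in range(y - radius, y + radius + 1):
        for u in range(x - radius, x + radius + 1):
            ix = (img[v][u + 1] - img[v][u - 1]) / 2.0  # central difference in x
            iy = (img[v + 1][u] - img[v - 1][u]) / 2.0  # central difference in y
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            count += 1
    return sxx / count, sxy / count, syy / count


def smallest_eigenvalue(hxx, hxy, hyy):
    """Smallest eigenvalue of the symmetric 2x2 matrix [[hxx, hxy], [hxy, hyy]]."""
    half_trace = (hxx + hyy) / 2.0
    half_diff = (hxx - hyy) / 2.0
    return half_trace - math.sqrt(half_diff * half_diff + hxy * hxy)


# A pure vertical edge varies in one direction only, so H is rank deficient.
edge = [[10.0 * col for col in range(5)] for _ in range(5)]
edge_strength = smallest_eigenvalue(*averaged_gradient_products(edge, 2, 2))

# A bright quadrant (an L-shaped corner) varies in both directions.
corner = [[100.0 if (col >= 2 and row >= 2) else 0.0 for col in range(5)]
          for row in range(5)]
corner_strength = smallest_eigenvalue(*averaged_gradient_products(corner, 2, 2))
```

The edge patch yields a zero smallest eigenvalue while the corner patch yields a clearly positive one, which is why this quantity separates corners from edges.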

H can be explained in terms of the first fundamental form of the image surface [55]. From 
analysis of the second fundamental form, a new detector is proposed which detects points where 
the probability of the surface being hyperbolic is high. 

Instead of local SSD, general template matching, given a warp, appearance model and pointwise 
comparison which behaves similarly to the SSD (sum of squared differences) for small 
differences can be considered [56]. The stability with respect to the match parameters is derived, 
and the result is a generalization of H (where H is maximally stable for no appearance model, 
linear translation and SSD matching). This is used to derive detectors which will give points 
maximally stable for template matching, given similarity transforms, illumination models and 
prefiltering. 

b) Laplacian based detectors: An alternative approach to the problem of finding a scalar 
value which measures the amount of second derivative is to take the Laplacian of the image. 
Since second derivatives greatly amplify noise, the noise is reduced by using the smoothed 
Laplacian, which is computed by convolving the image with the LoG (Laplacian of a Gaussian). 
Since the LoG kernel is symmetric, one can interpret this as performing matched filtering for 
features which are the same shape as a LoG. As a result, the variance of the Gaussian determines 
the size of features of interest. It has been noted [57] that the locations of maxima of the LoG 
over different scales are particularly stable. 

Scale invariant corners can be extracted by convolving the image with a DoG (Difference of 
Gaussians) kernel at a variety of scales (three per octave) and selecting local maxima in space 
and scale [58]. DoG is a good approximation for LoG and is much faster to compute, especially 
as the intermediate results are useful for further processing. To reject edge-like features, the 
eigenvalues of the Hessian of the image are computed and features are kept if the eigenvalues 
are sufficiently similar (within a factor of 10). This method can be contrasted with (3), where 
the Laplacian is compared to the magnitude of the edge response. If two scales per octave 
are satisfactory, then a significant speed increase can be achieved by using recursive filters to 
approximate Gaussian convolution [59]. 

Harris-Laplace [60] features are detected using a similar approach. An image pyramid is built 
and features are detected by computing $C_H$ at each layer of the pyramid. Features are selected 
if they are a local maximum of $C_H$ in the image plane and a local maximum of the LoG across 
scales. 

Recently, scale invariance has been extended to consider features which are invariant to affine 
transformations [57], [61], [62], [63]. However, unlike the 3D scale space, the 6D affine space 
is too large to search, so all of these detectors start from corners detected in scale space. These 
in turn rely on 2D features selected in the layers of an image pyramid. 

3) Direct greylevel detectors: Another major class of corner detectors works by examining a 
small patch of an image to see if it "looks" like a corner. The detectors described in this paper 
belong in this section. 

a) Wedge model detectors: A number of techniques assume that a corner has the general 
appearance of one or more wedges of a uniform intensity on a background of a different uniform 
intensity. For instance a corner can be modelled as a single [64] or family [65] of blurred wedges 
where the parameters are found by fitting a parametric model. The model can include angle, 
orientation, contrast, bluntness and curvature of a single wedge [66]. In a manner similar to [67], 
convolution masks can be derived for various wedges which optimize signal to noise ratio and 
localisation error, under the assumption that the image is corrupted by Gaussian noise [68]. 

It is more straightforward to detect wedges in binary images and to get useful results, local 
thresholding can be used to binarize the image [69]. If a corner is a bilevel wedge, then a 
response function based on local Zernike moments can be used to detect corners [70]. A more 
direct method for finding wedges is to find points where concentric contiguous arcs of 
pixels are significantly different from the centre pixel [71]. According to the wedge model, a 
corner will be the intersection of several edges. An angle-only Hough transform [72] is performed 
on edgels belonging to lines passing through a candidate point to find their angles and hence 
detect corners [73]. Similar reasoning can be used to derive a response function based on gradient 
moments to detect V-, T- and X-shaped corners [74]. The strength of the edgels, wedge angle 
and dissimilarity of the wedge regions has also been used to find corners [75]. 

b) Self dissimilarity: The tip of a wedge is not self-similar, so this can be generalized by 
defining corners as points which are not self-similar. The proportion of pixels in a disc around 
a centre (or nucleus) which are similar to the centre is a measure of self similarity. This is 
the USAN (univalue segment assimilating nucleus). Corners are defined as SUSAN (smallest 
USAN, i.e. local minima) points which also pass a set of rules to suppress qualitatively bad 
features. In practice, a weighted sum of the number of pixels inside a disc whose intensity is 
within some threshold of the centre value is used [76]. COP (Crosses as Oriented Pair) [77] 
computes dominant directions using local averages of USANs of a pair of oriented crosses, and 
defines corners as points with multiple dominant directions. 
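
The USAN measure can be sketched in a few lines of Python. The version below is a simplified illustration: it uses a hard threshold and an unweighted count, whereas the text notes that a weighted sum is used in practice, and it omits the rules that suppress bad features. The function name, the disc radius of 3 and the threshold of 10 are hypothetical values chosen for the example.

```python
def usan_area(img, x, y, radius=3, t=10):
    """Count disc pixels whose intensity is within t of the nucleus at (x, y).

    `img` is a list of rows and the whole disc must lie inside the image.
    A small count means the nucleus is unlike its surroundings.
    """
    nucleus = img[y][x]
    count = 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            inside_disc = dx * dx + dy * dy <= radius * radius
            if inside_disc and abs(img[y + dy][x + dx] - nucleus) <= t:
                count += 1
    return count


# In a uniform region every disc pixel matches the nucleus.
flat = [[50] * 9 for _ in range(9)]
flat_area = usan_area(flat, 4, 4)

# At the tip of a bright quadrant only about a quarter of the disc matches.
quadrant = [[100 if (col >= 4 and row >= 4) else 0 for col in range(9)]
            for row in range(9)]
tip_area = usan_area(quadrant, 4, 4)
```

The count drops sharply at the tip of the wedge compared with the uniform region, so local minima of this area mark corner candidates.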

Self similarity can be measured using a circle instead of a disc [78]. The SSD between the centre 
pixel and the pixels at either end of a diameter line is an oriented measure of self-dissimilarity. 
If this is small in any orientation then the point is not a corner. This is computationally efficient 
since the process can be stopped as soon as one small value is encountered. This detector is 
also used by [79] with the additional step that the difference between the centre pixel and circle 
pixels is used to estimate the Laplacian, and points are also required to be locally maximal in 
the Laplacian. 

Small regions with a large range in grey values can be used as corners. To find these efficiently, 
the image can be projected on to the x and y axes and large peaks found in the second 
derivatives. Candidate corner locations are the intersections of these maxima projected back 
in to the image [80]. Paler et al. [81] propose that self similarity can be measured by comparing 
the centre pixel of a window to the median value of pixels in the window. In practice, several 
percentile values (as opposed to just the 50th) are used. 

Self-dissimilar patches will have a high energy content. Composing two orthogonal quadrature 
pair Gabor filters gives oriented energy. Corners are maxima of total energy (the sum of oriented 
energy over a number of directions) [82]. 

A fast radial symmetry transform is developed in [83] to detect points. Points have a high 
score when the gradient is radially symmetric, strong, and of a uniform sign along the 
radius. The detected points have some resemblance to DoG features. 

c) Machine learning based detectors: All the detectors described above define corners using 
a model or algorithm and apply that algorithm directly to the image. An alternative is to train 
a classifier on the model and then apply the classifier to the image. For instance, a multilayer 
perceptron can be trained on example corners from some model and applied to the image after 
some processing [84], [85]. 

October 14, 2008 DRAFT 

Human perception can be used instead of a model [86]: images are shown to a number of 
test subjects. Image locations which are consistently fixated on (as measured by an eye tracking 
system) are taken to be interesting, and a support vector machine is trained to recognize these 
points. 

If a classifier is used, then it can be trained according to how a corner should behave, i.e. 
that its performance in a system for evaluating detectors should be maximized. Trujillo and 
Olague [87] state that detected points should have a high repeatability (as defined by [1]), be 
scattered uniformly across the image and that there should be at least as many points detected 
as requested. A corner detector function is optimized (using genetic programming) to maximize 
the score based on these measures. 

The FAST-n detector (described in Section III) is related to the wedge-model style of detector 
evaluated using a circle surrounding the candidate pixel. To optimize the detector for speed, this 
model is used to train a decision tree classifier and the classifier is applied to the image. The 
FAST-ER detector (described in Section V) is a generalization which allows the detector to be 
optimized for repeatability. 

B. Comparison of feature detectors 

Considerably less work has been done on comparison and evaluation of feature detectors than 
on inventing new detectors. The tests fall into three broad categories¹: 

1) Corner detection as object recognition. Since there is no good definition of exactly what a 
corner should look like, algorithms can be compared using simplistic test images where the 
performance is evaluated (in terms of true positives, false positives, etc.) as the image is 
altered using contrast reduction, warps and added noise. Since a synthetic image is used, 
corners exist only at known locations, so the existence of false negatives and false positives 
is well defined. However, the method and results do not generalize to natural images. 

2) System performance. The performance of an application (often tracking) is evaluated as the 
corner detector is changed. The advantage is that it tests the suitability of detected corners 
for further processing. However, poor results would be obtained from a detector ill matched 
to the downstream processing. Furthermore the results do not necessarily generalize well 
to other systems. To counter this, sometimes part of a system is used, though in this case 
the results do not necessarily apply to any system. 

3) Repeatability. This tests whether corners are detected from multiple views. It is a low level 
measure of corner detector quality and provides an upper bound on performance. Since it is 
independent of downstream processing, the results are widely applicable, but it is possible 
that the detected features may not be useful. Care must be used in this technique, since 
the trivial detector which identifies every pixel as a corner achieves 100% repeatability. 
Furthermore, the repeatability does not provide information about the usefulness of the 
detected corners for further processing. For instance, the brightest pixels in the image are 
likely to be repeatable but not especially useful. 

¹ Tests for the localisation accuracy are not considered here since for most applications the presence or absence of useful 
corners is the limiting factor. 

In the first category, Rajan and Davidson [88] produce a number of elementary test images 
with a very small number of corners (1 to 4) to test the performance of detectors as various 
parameters are varied. The parameters are corner angle, corner arm length, corner adjacency, 
corner sharpness, contrast and additive noise. The positions of detected corners are tabulated 
against the actual corner positions as the parameters are varied. Cooper et al. [34], [89] use 
a synthetic test image consisting of regions of uniform intensity arranged to create L-, T-, Y- 
and X-shaped corners. The pattern is repeated several times with decreasing contrast. Finally, 
the image is blurred and Gaussian noise is added. Chen et al. [85] use a related method. A 
known test pattern is subjected to a number of random affine warps and contrast changes. They 
note that this is naive, but tractable. They also provide an equivalent to the ROC (Receiver 
Operating Characteristic) curve. Zhang et al. [90] generate random corners according to their 
model and plot localization error, false positive rate and false negative rate against the detector 
and generated corner parameters. Luo et al. [43] use an image of a carefully constructed scene 
and plot the proportion of true positives as the scale is varied and noise is added for various 
corner angles. 

Mohanna and Mokhtarian [91] evaluate performance using several criteria. Firstly, they define 
a consistent detector as one where the number of detected corners does not vary with various 
transforms such as addition of noise and affine warping. This is measured by the 'consistency 
of corner numbers' (CCN): 

$$CCN = 100 \times 1.1^{-|n_t - n_o|}, \qquad (6)$$

where $n_t$ is the number of features in the transformed image and $n_o$ is the number of features in 
the original image. This test does not determine the quality of the detected corners in any way, 
so they also propose measuring the accuracy (ACU) as: 

$$ACU = 100 \times \frac{\frac{n_a}{n_o} + \frac{n_a}{n_g}}{2}, \qquad (7)$$

where $n_o$ is the number of detected corners, $n_g$ is the number of so-called 'ground truth' corners 
and $n_a$ is the number of detected corners which are close to ground truth corners. Since real 
images are used, there is no good definition of ground truth, so a number of human test subjects 
(e.g. 10) familiar with corner detection in general, but not the methods under test, label corners 
in the test images. Corners which 70% of the test subjects agree on are kept as ground truth 
corners. This method unfortunately relies on subjective decisions. 
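
The two measures are simple enough to evaluate directly. The sketch below assumes the forms CCN = 100 × 1.1^(−|n_t − n_o|) and ACU = 100 × (n_a/n_o + n_a/n_g)/2, with n_t, n_o, n_g and n_a as defined above; the function names and the example counts are illustrative only.

```python
def consistency_of_corner_numbers(n_t, n_o):
    """CCN: 100 when the corner count is unchanged, decaying as the counts diverge."""
    return 100.0 * 1.1 ** (-abs(n_t - n_o))


def accuracy(n_a, n_o, n_g):
    """ACU: mean of the matched fraction of detections and of ground truth corners.

    n_a: detected corners close to a ground truth corner
    n_o: detected corners
    n_g: ground truth corners
    """
    return 100.0 * (n_a / n_o + n_a / n_g) / 2.0


# A detector that finds 50 corners before and after a warp is fully consistent.
same_count = consistency_of_corner_numbers(50, 50)

# Gaining or losing ten corners is penalised identically.
ten_more = consistency_of_corner_numbers(60, 50)
ten_fewer = consistency_of_corner_numbers(40, 50)

# 8 of 10 detections match, covering 8 of 16 ground truth corners.
example_acu = accuracy(8, 10, 16)
```

Note that CCN depends only on the difference in counts, so it cannot distinguish a detector that finds the same corners from one that finds the same number of different corners, which is the limitation the text points out.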

Remarkably, of the systems above, only [85], [88] and [86] provide ROC curves (or equivalent): 
otherwise only a single point (without consistency on either axis of the ROC graph) is measured. 

In the second category, Trajkovic and Hedley [78] define stability as the number of 'strong' 
matches, matches detected over three frames in their tracking algorithm, divided by the total 
number of corners. Tissainayagam and Suter [92] use a similar method: a corner in frame n 
is stable if it has been successfully tracked from frame 1 to frame n. Bae et al. [77] detect 
optical flow using cross correlation to match corners between frames and compare the number 
of matched corners in each frame to the number of corners in the first frame. 

To get more general results than provided by system performance, the performance can be 
computed using only one part of a system. For instance, Mikolajczyk and Schmid [93] test a large 
number of interest point descriptors and a small number of closely related detectors by computing 
how accurately interest point matching can be performed. Moreels and Perona [94] perform 
detection and matching experiments across a variety of scene types under a variety of lighting 
conditions. Their results illustrate the difficulties in generalizing from system performance since 
the best detector varies with both the choice of descriptor and lighting conditions. 

In the third category, Schmid et al. [1] propose that when measuring reliability, the important 
factor is whether the same real-world features are detected from multiple views. For an image 
pair, a feature is 'detected' if it is extracted in one image and appears in the second. It is 
'repeated' if it is also detected nearby in the second. The repeatability is the ratio of repeated 
features to detected features. They perform the tests on images of planar scenes so that the 
relationship between point positions is a homography. Fiducial markers are projected onto the 
planar scene using an overhead projector to allow accurate computation of the homography. 
To measure the suitability of interest points for further processing, the information content of 
descriptors of patches surrounding detected points is also computed. 
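
The repeatability ratio can be sketched as follows. This is a minimal illustration, not the evaluation code of [1]: it assumes that every feature extracted in the first image is visible in the second (so all of them count as 'detected'), that a caller-supplied function `warp` maps coordinates from the first view into the second (a homography for planar scenes), and that 'nearby' means within a hypothetical radius of 2 pixels.

```python
def repeatability(features_a, features_b, warp, radius=2.0):
    """Fraction of features from view A that are detected again nearby in view B.

    features_a, features_b: lists of (x, y) positions found in each image.
    warp: function mapping an (x, y) position in A to its position in B.
    """
    if not features_a:
        return 0.0
    repeated = 0
    for point in features_a:
        qx, qy = warp(point)
        # The feature is repeated if any detection in B lies within `radius`.
        if any((qx - bx) ** 2 + (qy - by) ** 2 <= radius ** 2
               for bx, by in features_b):
            repeated += 1
    return repeated / len(features_a)


def shift_right(point):
    """Toy view change for the example: a pure translation of 5 pixels in x."""
    return point[0] + 5.0, point[1]


view_a = [(0.0, 0.0), (10.0, 10.0), (20.0, 20.0)]
view_b = [(5.0, 0.0), (15.0, 11.0), (40.0, 40.0)]
score = repeatability(view_a, view_b, shift_right)
```

In this toy case two of the three features reappear within the radius, so the score is two thirds. A detector that marks every pixel would score 1.0 here, which is the degenerate case the text warns about.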

III. High-speed corner detection 

A. FAST: Features from Accelerated Segment Test 

The segment test criterion operates by considering a circle of sixteen pixels around the corner 
candidate p. The original detector [95], [96] classifies p as a corner if there exists a set of n 
contiguous pixels in the circle which are all brighter than the intensity of the candidate pixel $I_p$ 
plus a threshold t, or all darker than $I_p - t$, as illustrated in Figure 1. n was originally chosen to 
be twelve because it admits a high-speed test which can be used to exclude a very large number 
of non-corners. The high-speed test examines pixels 1 and 9. If both of these are within t of 
$I_p$, then p cannot be a corner. If p can still be a corner, pixels 5 and 13 are examined. If p is 
a corner then at least three of these four pixels must all be brighter than $I_p + t$ or darker than $I_p - t$. If 
neither of these is the case, then p cannot be a corner. The full segment test criterion can then 
be applied to the remaining candidates by examining all pixels in the circle. This detector in 
itself exhibits high performance, but there are several weaknesses: 

1) This high-speed test does not reject as many candidates for n < 12, since the point can be a 
corner if only two out of the four pixels are both significantly brighter or both significantly 
darker than p (assuming the pixels are adjacent). Additional tests are also required to find 
if the complete test needs to be performed for a bright ring or a dark ring. 

2) The efficiency of the detector will depend on the ordering of the questions and the 
distribution of corner appearances. It is unlikely that this choice of pixels is optimal. 

3) Multiple features are detected adjacent to one another. 
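
The segment test and its high-speed pre-test can be sketched in Python as below. This is a plain reference illustration rather than the optimized detector: the ring offsets are the usual radius-3 discretised circle with position 1 at the top and numbering clockwise (an assumption, since the figure is not reproduced here), and the comparisons are non-strict so that they agree with the three pixel states used later in the learning stage.

```python
# Offsets (dx, dy) of the 16 ring pixels; index 0 is "pixel 1" directly above p.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def segment_test(img, x, y, t, n=12):
    """True if n contiguous ring pixels are all brighter or all darker than p by t."""
    centre = img[y][x]
    ring = [img[y + dy][x + dx] for dx, dy in CIRCLE]
    for sign in (+1, -1):  # +1 looks for a bright arc, -1 for a dark arc
        flags = [sign * (value - centre) >= t for value in ring]
        run = 0
        # Append the first n - 1 flags again so arcs crossing index 0 are found.
        for flag in flags + flags[:n - 1]:
            run = run + 1 if flag else 0
            if run >= n:
                return True
    return False


def quick_reject(img, x, y, t):
    """High-speed test for n = 12 using pixels 1, 9, 5 and 13.

    Returns True when p cannot be a corner, i.e. fewer than three of the four
    compass pixels are brighter and fewer than three are darker.
    """
    centre = img[y][x]
    compass = [img[y + dy][x + dx]
               for dx, dy in (CIRCLE[0], CIRCLE[8], CIRCLE[4], CIRCLE[12])]
    brighter = sum(value >= centre + t for value in compass)
    darker = sum(value <= centre - t for value in compass)
    return brighter < 3 and darker < 3


def ring_image(dark_indices):
    """7x7 test image of intensity 100 with the listed ring positions set to 0."""
    img = [[100] * 7 for _ in range(7)]
    for i in dark_indices:
        dx, dy = CIRCLE[i]
        img[3 + dy][3 + dx] = 0
    return img


eleven_dark = ring_image(range(11))                           # arc of 11 pixels
wrapped_dark = ring_image(list(range(10, 16)) + list(range(6)))  # arc of 12 across index 0
```

With these images an arc of eleven dark pixels fails the test for n = 12 but passes for n = 9, and an arc of twelve that wraps around position 1 is still found.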

B. Improving generality and speed with machine learning 

Here we expand on the work first presented in [97] and present an approach which uses 
machine learning to address the first two points (the third is addressed in Section III-C). The 
process operates in two stages. First, to build a corner detector for a given n, all of the 16 
pixel rings are extracted from a set of images (preferably from the target application domain). These 
are labelled using a straightforward implementation of the segment test criterion for n and a 
convenient threshold. 

For each location on the circle $x \in \{1 \ldots 16\}$, the pixel at that position relative to p, denoted 
by $p \to x$, can have one of three states: 

$$S_{p \to x} = \begin{cases} d, & I_{p \to x} \le I_p - t & \text{(darker)} \\ s, & I_p - t < I_{p \to x} < I_p + t & \text{(similar)} \\ b, & I_p + t \le I_{p \to x} & \text{(brighter)} \end{cases} \qquad (8)$$

Let P be the set of all pixels in all training images. Choosing an x partitions P into three 
subsets, $P_d$, $P_s$ and $P_b$, where: 

$$P_b = \{p \in P : S_{p \to x} = b\}, \qquad (9)$$

and $P_d$ and $P_s$ are defined similarly. In other words, a given choice of x is used to partition 
the data in to three sets. The set $P_d$ contains all points where pixel x is darker than the centre 
pixel by a threshold t, $P_b$ contains points brighter than the centre pixel by t, and $P_s$ contains 
the remaining points where pixel x is similar to the centre pixel. 
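
The three-way split can be written down directly. In the sketch below each training sample is represented as a pair (centre intensity, list of 16 ring intensities); this representation, the 0-based ring index and the function names are assumptions made for the example. Boundary values are assigned to the darker and brighter states so that the three states cover every case exactly once.

```python
def pixel_state(i_p, i_px, t):
    """Return 'd', 's' or 'b' for a ring pixel of intensity i_px around centre i_p."""
    if i_px <= i_p - t:
        return "d"  # darker
    if i_px >= i_p + t:
        return "b"  # brighter
    return "s"      # similar


def partition(samples, x, t):
    """Split samples into the subsets P_d, P_s and P_b by the state of ring position x.

    Each sample is (centre_intensity, ring), where ring holds 16 intensities
    and x is a 0-based index into it.
    """
    subsets = {"d": [], "s": [], "b": []}
    for sample in samples:
        centre, ring = sample
        subsets[pixel_state(centre, ring[x], t)].append(sample)
    return subsets


samples = [
    (100, [80] + [100] * 15),   # position 0 is darker by exactly t
    (100, [120] + [100] * 15),  # position 0 is brighter by exactly t
    (100, [100] * 16),          # position 0 is similar
]
parts = partition(samples, 0, 20)
```

Every sample lands in exactly one subset, so the sizes of the three subsets always sum to the size of the input set.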
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Let Kp be a boolean variable which is true if p is a corner and false otherwise. Stage 2 
employs the algorithm used in IDS ||98l and begins by selecting the x which yields the most 
information about whether the candidate pixel is a corner, measured by the entropy of Kp. 

The total entropy of K for an arbitrary set of comers, Q, is: 

H{Q) = (c + c) log2(c + c) - clog2 c - clog2 c (10) 
where c = \ {i E Q : Ki is true}| (number of corners) 
and c = \{i E Q : Ki is false} | (number of non corners) 

The choice of x then yields the information gain (Hg): 

Hg = H{P) - H{P,) - H{P,) - H{P,) (11) 
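
The entropy and information gain defined above can be computed directly from corner labels and pixel states. The sketch below is illustrative only: `entropy` and `information_gain` are hypothetical names, labels are booleans (the value of K for each sample), and states are the characters 'd', 's' and 'b' for one fixed choice of x.

```python
import math
from typing import Sequence


def _xlog2x(v: int) -> float:
    """v * log2(v), with the usual convention that 0 * log2(0) = 0."""
    return v * math.log2(v) if v > 0 else 0.0


def entropy(labels: Sequence[bool]) -> float:
    """Total entropy of K for a set of samples (unnormalised, as in the text)."""
    c = sum(1 for k in labels if k)   # number of corners
    c_bar = len(labels) - c           # number of non-corners
    return _xlog2x(c + c_bar) - _xlog2x(c) - _xlog2x(c_bar)


def information_gain(labels: Sequence[bool], states: Sequence[str]) -> float:
    """Entropy of the whole set minus the entropies of the d, s and b subsets."""
    gain = entropy(labels)
    for state in "dsb":
        subset = [k for k, s in zip(labels, states) if s == state]
        gain -= entropy(subset)
    return gain
```

A choice of x that separates corners from non-corners perfectly leaves three zero-entropy subsets, so its gain equals the entropy of the whole set; a choice that leaves every subset as mixed as the original has zero gain.
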

Having selected the $x$ which yields the most information, the process is applied recursively on 
all three subsets, i.e. $x_b$ is selected to partition $P_b$ into $P_{b,d}$, $P_{b,s}$, $P_{b,b}$, $x_s$ is selected to partition 
$P_s$ into $P_{s,d}$, $P_{s,s}$, $P_{s,b}$ and so on, where each $x$ is chosen to yield maximum information about 
the set it is applied to. The recursion process terminates when the entropy of a subset is zero. 
This means that all $p$ in this subset have the same value of $K_p$, i.e. they are either all corners or 
all non-corners. This is guaranteed to occur since $K$ is an exact function of the data. In summary, 
this procedure creates a decision tree which can correctly classify all corners seen in the training 
set and therefore (to a close approximation) correctly embodies the rules of the chosen FAST 
corner detector. 
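
The recursive procedure can be sketched as a small ID3-style builder for ternary trees. This is an illustration under stated assumptions, not the authors' implementation: each sample is a string of 'd'/'s'/'b' states (one character per ring position) paired with its corner label, a tree is either a boolean leaf or a tuple `(x, children)`, empty subsets are labelled non-corner, and all function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple, Union

Sample = Tuple[str, bool]  # (states for every ring position, K)
Tree = Union[bool, Tuple[int, Dict[str, "Tree"]]]


def entropy(samples: List[Sample]) -> float:
    """Unnormalised entropy of the corner labels in a set of samples."""
    def xlog2x(v: int) -> float:
        return v * math.log2(v) if v > 0 else 0.0
    c = sum(1 for _, k in samples if k)
    c_bar = len(samples) - c
    return xlog2x(c + c_bar) - xlog2x(c) - xlog2x(c_bar)


def split(samples: List[Sample], x: int) -> Dict[str, List[Sample]]:
    """Partition samples by the state of ring position x."""
    parts: Dict[str, List[Sample]] = {"d": [], "s": [], "b": []}
    for states, k in samples:
        parts[states[x]].append((states, k))
    return parts


def build_tree(samples: List[Sample], n_positions: int = 16) -> Tree:
    labels = {k for _, k in samples}
    if len(labels) <= 1:
        # Zero entropy: emit a leaf. Empty subsets become non-corners (assumption).
        return labels.pop() if labels else False
    best_x, best_gain = None, float("-inf")
    for x in range(n_positions):
        parts = split(samples, x)
        if sum(1 for p in parts.values() if p) < 2:
            continue  # this position does not separate the samples at all
        gain = entropy(samples) - sum(entropy(p) for p in parts.values())
        if gain > best_gain:
            best_x, best_gain = x, gain
    if best_x is None:
        raise ValueError("identical samples carry different labels")
    parts = split(samples, best_x)
    return (best_x, {s: build_tree(parts[s], n_positions) for s in "dsb"})


def classify(tree: Tree, states: str) -> bool:
    """Walk the tree for one sample and return its leaf class."""
    while not isinstance(tree, bool):
        x, children = tree
        tree = children[states[x]]
    return tree
```

Skipping positions that do not separate the samples guarantees that every recursive call receives a strictly smaller set, so the recursion terminates whenever the labels are an exact function of the states.
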

In some cases, two of the three subtrees may be the same. In this case, the boolean test which 
separates them is removed. This decision tree is then converted into C code, creating a long 
string of nested if-else statements which is compiled and used as a corner detector. For highest 
speed operation, the code is compiled using profile guided optimizations which allow branch 
prediction and block reordering optimizations. 
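
The conversion of a tree into nested if-else statements can be sketched as below. This is a hypothetical generator, not the one used by the authors: it assumes a tree is either a boolean leaf or a tuple `(x, children)` with children keyed by 'b', 'd' and 's', it assumes the generated C function receives the ring pixels in an array `ring`, the centre intensity `c` and the threshold `t`, and it does not implement the removal of tests whose subtrees are identical.

```python
def emit_c(tree, indent=1):
    """Return C source for one (sub)tree as nested if / else-if / else blocks."""
    pad = "    " * indent
    if isinstance(tree, bool):
        return f"{pad}return {int(tree)};\n"
    x, children = tree
    return (
        f"{pad}if (ring[{x}] >= c + t) {{\n"          # brighter branch
        + emit_c(children["b"], indent + 1)
        + f"{pad}}} else if (ring[{x}] <= c - t) {{\n"  # darker branch
        + emit_c(children["d"], indent + 1)
        + f"{pad}}} else {{\n"                          # similar branch
        + emit_c(children["s"], indent + 1)
        + f"{pad}}}\n"
    )


def emit_function(tree):
    """Wrap the generated body in a C function (the signature is an assumption)."""
    header = "int is_corner(const int ring[16], int c, int t)\n{\n"
    return header + emit_c(tree) + "}\n"
```

Calling `print(emit_function(tree))` on a learned tree yields a source file fragment that can be passed to a C compiler.
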

For further optimization, we force $x_b$, $x_d$ and $x_s$ to be equal. In this case, the second pixel 
tested is always the same. Since this is the case, the test against the first and second pixels can 
be performed in batch. This allows the first two tests to be performed in parallel for a strip 
of pixels using the vectorizing instructions present on many high performance microprocessors. 
Since most points are rejected after two tests, this leads to a significant speed increase. 

Note that since the data contains incomplete coverage of all possible corners, the learned 
detector is not precisely the same as the segment test detector. In the case of the FAST-n 
detectors, it is straightforward to include an instance of every possible combination of pixels 
(there are $3^{16} = 43{,}046{,}721$ combinations) with a low weight to ensure that the learned detector 
exactly computes the segment test criterion. 

C. Non-maximal suppression 

Since the segment test does not compute a corner response function, non-maximal suppression 
cannot be applied directly to the resulting features. For a given n, as t is increased, the number 
of detected corners will decrease. Since n = 9 produces the best repeatability results (see 
Section VI), variations in n will not be considered. The corner strength is therefore defined to 
be the maximum value of t for which a point is detected as a corner. 

The decision tree classifier can efficiently determine the class of a pixel for a given value 
of t. The class of a pixel (for example, 1 for a corner, 0 for a non-corner) is a monotonically 
decreasing function of t. Therefore, we can use bisection to efficiently find the point where the 
function changes from 1 to 0. This point gives us the largest value of t for which the point is 
detected as a corner. Since t is discrete, this is the binary search algorithm. 
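
A minimal sketch of the bisection follows. It assumes a callable `is_corner(t)` that wraps the decision tree for one fixed pixel and is monotonically decreasing in t; the function name, the threshold range 1 to 255 and the convention of returning 0 for a point that is never a corner are all assumptions of this sketch.

```python
from typing import Callable


def corner_strength(is_corner: Callable[[int], bool],
                    t_min: int = 1, t_max: int = 255) -> int:
    """Largest t in [t_min, t_max] for which is_corner(t) is true, else 0."""
    if not is_corner(t_min):
        return 0
    # Invariant: is_corner(lo) is true; hi is either out of range or a
    # threshold at which the point is no longer detected.
    lo, hi = t_min, t_max + 1
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_corner(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

With 255 possible thresholds the loop evaluates the detector at most eight times per pixel.
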

Alternatively, an iteration scheme can be used. A pixel on the ring 'passes' the segment test 
if it is not within t of the centre. If enough pixels fail, then the point will not be classified as a 
corner. The detector is run, and of all the pixels which pass the test, the amount by which they 
pass is found. The threshold is then increased by the smallest of these amounts, and the detector 
is rerun. This increases the threshold just enough to ensure that a different path is taken through 
the tree. This process is then iterated until detection fails. 

Because the speed depends strongly on the learned tree and the specific processor architecture, 
neither technique has a definitive speed advantage over the other. Non maximal suppression is 
performed in a 3 x 3 mask. 
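
Suppression in a 3 x 3 mask can be sketched as follows, given the corner strengths computed by either method. The sparse dictionary representation, the function name and the tie rule (a corner survives if no neighbour is strictly stronger) are assumptions of this sketch.

```python
from typing import Dict, List, Tuple


def non_max_suppress(strength: Dict[Tuple[int, int], int]) -> List[Tuple[int, int]]:
    """Keep corners that are at least as strong as all eight neighbours.

    strength maps (x, y) to the corner strength of each detected corner;
    positions that are absent are treated as having strength 0.
    """
    kept = []
    for (x, y), s in strength.items():
        neighbours = (
            strength.get((x + dx, y + dy), 0)
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            if (dx, dy) != (0, 0)
        )
        if all(s >= n for n in neighbours):
            kept.append((x, y))
    return sorted(kept)
```
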

IV. Measuring detector repeatability 

Fig. 2 

Repeatability is tested by checking if the same real-world features are detected in different views. A 
geometric model is used to compute where the features reproject to. (Panels: detect features in frame 1; 
detect features in frame 2.) 

For an image pair, a feature is 'useful' if it is extracted in one image and can potentially appear 
in the second (i.e. it is not occluded). It is 'repeated' if it is also detected near the same 
real-world point in the second. For the purposes of measuring repeatability this allows several features 
in the first image to match a single feature in the second image. The repeatability, $R$, is defined 
to be 

$$R = \frac{N_{\text{repeated}}}{N_{\text{useful}}}, \tag{12}$$

where $N_{\text{repeated}}$ and $N_{\text{useful}}$ are summed over all image pairs in an image sequence. This is 
equivalent to the weighted average of the repeatabilities for each image pair, where the weighting 
is the number of useful features. In this paper, we generally compute the repeatability for a given 
number of features per frame, varying between zero and 2000 features (for a 640 x 480 image). 
This also allows us to compute the area under the repeatability curve, $A$, as an aggregate score. 
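
The aggregate repeatability over a sequence can be sketched as below. The function name and the input format (one pair of counts per image pair) are assumptions; deciding which features are useful or repeated requires the geometric model and is outside this sketch.

```python
from typing import Sequence, Tuple


def repeatability(pairs: Sequence[Tuple[int, int]]) -> float:
    """Repeatability over a sequence of image pairs.

    Each entry is (n_repeated, n_useful) for one image pair. Both counts
    are summed over all pairs before dividing.
    """
    total_repeated = sum(r for r, _ in pairs)
    total_useful = sum(u for _, u in pairs)
    return total_repeated / total_useful if total_useful else 0.0
```

For two pairs with counts (50, 100) and (30, 300), the per-pair repeatabilities are 0.5 and 0.1. Weighting them by 100 and 300 useful features gives (50 + 30) / 400 = 0.2, which is what the function returns.
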

The repeatability measurement requires the location and visibility of every pixel in the first 
image to be known in the second image. In order to compute this, we use a 3D surface model 
of the scene to compute if and where detected features should appear in other views. 
This is illustrated in Figure 2. This allows the repeatability of the detectors to be analysed on 
features caused by geometry such as corners of polyhedra, occlusions and junctions. We also 
allow bas-relief textures to be modelled with a flat plane so that the repeatability can be tested 
under non-affine warping. 

The definition of 'nearby' above must allow a small margin of error ($\varepsilon$ pixels) because the 
alignment, the 3D model and the camera calibration (especially the radial distortion) are not 
perfect. Furthermore, the detector may find a maximum on a slightly different part of the corner. 
This becomes more likely as the change in viewpoint and hence change in shape of the corner 
become large. 

Fig. 3 

Box dataset: photographs taken of a test rig (consisting of photographs pasted to the inside of a 
cuboid) with strong changes of perspective, changes in scale and large amounts of radial distortion. 
This tests the corner detectors on planar texture. 

Instead of using fiducial markers, the 3D model is aligned to the scene by hand and this is 
then optimised using a blend of simulated annealing and gradient descent to minimise the SSD 
(sum of squared differences) between all pairs of frames and reprojections. To compute the SSD 
between frame i and reprojected frame j, the positions of all points in frame j are found in frame 
i. The images are then bandpass filtered. High frequencies are removed to reduce noise, while 
low frequencies are removed to reduce the impact of lighting changes. To improve the speed of 
the system, the SSD is only computed using 1000 random locations. 

The datasets used are shown in Figure 3, Figure 4 and Figure 5. With these datasets, we have 
tried to capture a wide range of geometric and textural corner types. 

V. FAST-ER: Enhanced repeatability 

Fig. 4 

Maze dataset: photographs taken of a prop used in an augmented reality application. This set consists 
of textural features undergoing projective warps as well as geometric features. There are also 
significant changes of scale. 

Fig. 5 

Bas-relief dataset: the model is a flat plane, but there are many objects with significant relief. This 
causes the appearance of features to change in a non-affine way from different viewpoints. 

Since the segment test detector can be represented as a ternary decision tree and we have 
defined repeatability, the detector can be generalized by defining a feature detector to be a 
ternary decision tree which detects points with high repeatability. The repeatability of such a 
detector is a non-convex function of the configuration of the tree, so we optimize the tree using 
simulated annealing. This results in a multi-objective optimization. If every point is detected as 
a feature, then the repeatability is trivially perfect. Also, if the tree complexity is allowed to 
grow without bound, then the optimization is quite capable of finding one single feature in each 
image in the training set which happens to be repeated. Neither of these is a useful result. To 
account for this, the cost function for the tree is defined to be: 

$$k = \left(1 + \left(\frac{w_r}{r}\right)^2\right)\left(1 + \frac{1}{N}\sum_{i=1}^{N}\left(\frac{d_i}{w_n}\right)^2\right)\left(1 + \left(\frac{s}{w_s}\right)^2\right), \tag{13}$$

where $r$ is the repeatability (as defined in (12)), $d_i$ is the number of detected corners in frame $i$, 
$N$ is the number of frames and $s$ is the size (number of nodes) of the decision tree. The effect of 
these costs is controlled by $w_r$, $w_n$ and $w_s$. Note that for efficiency, repeatability is computed 
at a fixed threshold as opposed to a fixed number of features per frame. 
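
A cost of this kind can be sketched as a product of three penalty terms, one each for low repeatability, high corner density and large tree size. The multiplicative form used below, with each term written as one plus a squared ratio, is an assumption of this sketch, as are the function name and the choice to return infinity when the repeatability is zero.

```python
from typing import Sequence


def tree_cost(r: float, detections: Sequence[int], s: int,
              w_r: float, w_n: float, w_s: float) -> float:
    """Cost of a candidate tree (lower is better).

    r          -- repeatability of the detector on the training pairs
    detections -- number of detected corners d_i in each frame
    s          -- number of nodes in the tree
    w_r, w_n, w_s -- weights controlling the three penalties
    """
    if r <= 0.0:
        return float("inf")  # nothing is repeated: worst possible cost
    repeat_term = 1.0 + (w_r / r) ** 2
    density_term = 1.0 + sum((d / w_n) ** 2 for d in detections) / len(detections)
    size_term = 1.0 + (s / w_s) ** 2
    return repeat_term * density_term * size_term
```

Each term equals 2 when its quantity matches the corresponding weight, so the weights act as scales at which each penalty starts to matter.
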

The corner detector should be invariant to rotation, reflection and intensity inversion of the 
image. To prevent excessive burden on the optimization algorithm, each time the tree is evaluated, 
it is applied sixteen times: at four rotations, 90° apart, with all combinations of reflection and 
intensity inversion. The result is the logical OR of the detector applications: a corner is detected 
if any one of the sixteen applications of the tree classifies the point as a corner. 

Fig. 6 

Positions of offsets used in the FAST-ER detector. 

Each node of the tree has an offset relative to the centre pixel, $x$, with $x \in \{0 \ldots 47\}$ as 
defined in Figure 6. Therefore, $x = 0$ refers to the offset (-1, 4). Each leaf has a class $K$, with 
0 for non-corners and 1 for corners. Apart from the root node, each node is either on a b, d or 
s branch of its parent, depending on the test outcome which leads to that branch. The tree is 
constrained so that each leaf on an s branch of its direct parent has $K = 0$. This ensures that 
the number of corners generally decreases as the threshold is increased. 
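
The sixteen-fold application of a tree can be sketched as below. This is an illustration under assumptions: nodes store their offset directly as a `(dx, dy)` pair instead of an index into the 48 positions, `state_at(offset)` is a hypothetical callable returning 'd', 's' or 'b' for the pixel at that offset from the centre, rotation and reflection are applied to the offsets rather than to the image, and intensity inversion is modelled by swapping the 'd' and 'b' outcomes.

```python
def run_tree(tree, state_at):
    """Evaluate one tree; leaves are booleans, nodes are (offset, children)."""
    while not isinstance(tree, bool):
        offset, children = tree
        tree = children[state_at(offset)]
    return tree


def detect_invariant(tree, state_at):
    """Logical OR of the tree over 4 rotations x reflection x inversion."""
    swap = {"d": "b", "b": "d", "s": "s"}
    for rot in range(4):
        for reflect in (False, True):
            for invert in (False, True):
                def transformed(offset, rot=rot, reflect=reflect, invert=invert):
                    dx, dy = offset
                    if reflect:
                        dx = -dx              # mirror about the vertical axis
                    for _ in range(rot):
                        dx, dy = -dy, dx      # rotate by 90 degrees
                    state = state_at((dx, dy))
                    return swap[state] if invert else state
                if run_tree(tree, transformed):
                    return True
    return False
```

A tree that only fires on a brighter pixel at offset (1, 0) will, through this wrapper, also fire on a darker pixel at (0, -1), since that configuration is a rotated and inverted copy of the first.
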

The simulated annealing optimizer makes random modifications to the tree by first selecting 
a node at random and then mutating it. If the selected node is: 

• a leaf, then with equal probability, either: 

1) Replace node with a random subtree of depth 1. 

2) Flip classification of node. This choice is not available if the leaf class is constrained. 

• a node, then with equal probability, choose any one of: 

1) Replace the offset with a random value in 0 ... 47. 

2) Replace the node with a leaf with a random class (subject to the constraint). 

3) Remove a randomly selected branch of the node and replace it with a copy of another 
randomly selected branch of that node. For example, a b branch may be replaced with 
a copy of an s branch. 

The randomly grown subtree consists of a single decision node (with a random offset in 0 ... 47), 
and three leaf nodes. With the exception of the constrained leaf, the leaves of this random 
subtree have random classes. These modifications to the tree allow growing, mutation, mutation 
and shrinking of the tree, respectively. The last modification of the tree is motivated by our 
observations of the FAST-9 detector. In FAST-9, a large number of nodes have the characteristic 
that two out of the three subtrees are identical. Since FAST-9 exhibits high repeatability, we 
have included this modification to allow FAST-ER to easily learn a similar structure. 

The modifications are accepted according to the Boltzmann acceptance criterion, where the 
probability $P$ of accepting a change at iteration $I$ is: 

$$P = e^{\frac{\hat{k}_{I-1} - k_I}{T}} \tag{14}$$

where $\hat{k}$ is the cost after application of the acceptance criterion and $T$ is the temperature. The 
temperature follows an exponential schedule: 

$$T = \beta e^{-\alpha \frac{I}{I_{\max}}}, \tag{15}$$

where $I_{\max}$ is the number of iterations. The algorithm is initialized with a randomly grown tree of 
depth 1, and the algorithm uses a fixed threshold, t. Instead of performing a single optimization, 
the optimizer is rerun a number of times using different random seeds. 
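
The acceptance rule and temperature schedule can be sketched as follows. The function names are hypothetical, and treating any acceptance probability of 1 or more as certain acceptance is the usual Boltzmann convention assumed here, as is the exponential form of the schedule with parameters `alpha` and `beta`.

```python
import math
import random


def temperature(i: int, i_max: int, alpha: float, beta: float) -> float:
    """Exponential schedule: beta at i = 0, beta * exp(-alpha) at i = i_max."""
    return beta * math.exp(-alpha * i / i_max)


def accept(prev_cost: float, new_cost: float, temp: float, rng=random) -> bool:
    """Boltzmann acceptance of a proposed modification.

    Improvements are always accepted; a worse tree is accepted with
    probability exp((prev_cost - new_cost) / temp).
    """
    if new_cost <= prev_cost:
        return True
    return rng.random() < math.exp((prev_cost - new_cost) / temp)
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when the optimizer is rerun with different seeds.
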

Because the detector must be applied to the images every iteration, each candidate tree in all 
sixteen transformations is compiled to machine code in memory and executed directly. Since it 
is applied with sixteen transformations, the resulting detector is not especially efficient. So for 
efficiency, the detector is used to generate training data so that a single tree can be generated 
using the method described in Section III-B. The resulting tree contains approximately 30,000 
non-leaf nodes. 

A. Parameters and justification 

The parameters used for training are given in Table I. The entire optimization which consists of 
100 repeats of a 100,000 iteration optimization requires about 200 hours on a Pentium 4 at 3GHz. 
Finding the optimal set of parameters is essentially a high dimensional optimization problem, 
with many local optima. Furthermore, each evaluation of the cost function is very expensive. 
Therefore, the values are in no sense optimal, but they are a set of values which produce good 
results. Refer to [99] for techniques for choosing parameters of a simulated annealing based 
optimizer. Recall that the training set consists of only the first three images from the 'box' 
dataset. 

TABLE I 

Parameters used to optimize the tree. 

Training set: 'box' set, images 0-2. 



The weights determine the relative effects of good repeatability, resistance to overfitting and 
corner density, and therefore will affect the performance of the resulting corner detector. To 
demonstrate the sensitivity of the detector with respect to $w_r$, $w_n$ and $w_s$, a detector was 
learned for three different values of each, $w_r \in \{0.5, 1, 2\}$, $w_n \in \{1750, 3500, 7000\}$ and 
$w_s \in \{5000, 10000, 20000\}$, resulting in a total of 27 parameter combinations. The performance 
of the detectors is evaluated by computing the mean area under the repeatability curve for 
the 'box', 'maze' and 'bas-relief' datasets. Since in each of the 27 points, 100 runs of the 
optimization are performed, each of the 27 points produces a distribution of scores. The results 
of this are shown in Figure 7. The variation in score with respect to the parameters is quite low 
even though the parameters all vary by a factor of four. Given that, the results for the set of 
parameters in Table I are very close to the results for the best tested set of parameters. This 
demonstrates that the choices given in Table I are reasonable. 

VI. Results 

In this section, the FAST and FAST-ER detectors are compared against a variety of other 
detectors both in terms of repeatability and speed. In order to test the detectors further, we have 
used the 'Oxford' dataset [100] in addition to our own. This dataset models the warp between 
images using a homography, and consists of eight sequences of six images each. It tests detector 
repeatability under viewpoint changes (for approximately planar scenes), lighting changes, blur 
and JPEG compression. Note that the FAST-ER detector is trained on 3 images (6 image pairs), 
and is tested on a total of 85 images (688 image pairs). 

Fig. 7 

Distribution of scores for various parameters of ($w_r$, $w_n$, $w_s$). The parameters leading to the best result 
are (2.0, 3500, 5000) and the parameters for the worst point are (0.5, 3500, 5000). For comparison, the 
distribution for all 27 runs and the median point (given in Table I) are given. The score given is the 
mean value of $A$ computed over the 'box', 'maze' and 'bas-relief' datasets. 

The parameters used in the various detectors are given in Table II. In all cases (except SUSAN, 
which uses the reference implementation in [101]), non-maximal suppression is performed using 
a 3 x 3 mask. The number of features was controlled in a manner equivalent to thresholding 
on the response. For the Harris-Laplace detector, the Harris response was used, and for the 
SUSAN detector, the 'distance threshold' parameter was used. It should be noted that some 
experimentation was performed on all the detectors to find the best results on our dataset. In 
the case of FAST-ER, the best detector was selected. The parameters were then used without 
modification on the 'Oxford' dataset. The timing results were obtained with the same parameters 
used in the repeatability experiment. 

A. Repeatability 

The repeatability is computed as the number of corners per frame is varied. For comparison 
we also include a scattering of random points as a baseline measure, since in the limit if every 
pixel is detected as a corner, then the repeatability is 100%. To test robustness to image noise, 
increasing amounts of Gaussian noise were added to the bas-relief dataset, in addition to the 
significant amounts of camera noise already present. Aggregate results taken over all datasets 
are given in Table III. It can be seen from this that on average, FAST-ER outperforms all the 
other tested detectors. 

TABLE II 

Parameters used for testing corner detectors. 

Scales per octave      3 
Initial blur σ         0.8 
Blur σ                 2.5 
Distance threshold     4.0 
Harris-Laplace 
  Initial blur σ       0.8 
  Harris blur          3 
  Scales per octave    10 
General parameters 
  ε                    5 pixels 

TABLE III 

Area under repeatability curves for 0-2000 corners per frame averaged over all the evaluation 
datasets (except the additive noise). 

Detector          A 
FAST-ER           1313.6 
FAST-9            1304.57 
DoG               1275.59 
Shi & Tomasi      1219.08 
Harris            1195.2 
Harris-Laplace    1153.13 
FAST-12           1121.53 
SUSAN             1116.79 
Random            271.73 

More detailed results are shown in Figures 8, 10 and 11. As shown in Figure 8, FAST-9 performs 
best (FAST-8 and below are edge detectors), so only FAST-9 and FAST-12 (the original FAST 
detector) are given. 

October 14, 2008 DRAFT 

Fig. 8 

A comparison of the FAST-n detectors on the 'bas-relief' dataset shows that n = 9 is the most repeatable. For 
n ≤ 8, the detector starts to respond strongly to edges. (Plot of repeatability against corners per frame, 
from 0 to 2000, with one curve for each value of N in the key, N = 9 to N = 16.) 

The FAST-9 feature detector, despite being designed only for speed, generally outperforms all 
but FAST-ER on these images. FAST-n, however, is not very robust to the presence of noise. 
This is to be expected. High speed is achieved by analysing the fewest pixels possible, so the 
detector's ability to average out noise is reduced. 

The best repeatability results are achieved by FAST-ER. FAST-ER easily outperforms FAST-9 
in all but Figures 10A, 11B, C and E. These results are slightly more mixed, but FAST-ER still 
performs very well for higher corner densities. FAST-ER greatly outperforms FAST-9 on the noise 
test (and outperforms all other detectors for σ < 7). This is because the training parameters bias 
the detector towards detecting more corners for a given threshold than FAST-9. Consequently, 
for a given number of features per frame, the threshold is higher, so the effect of noise will be 
reduced. 

As the number of corners per frame is increased, all of the detectors, at some point, suffer from 
decreasing repeatability. This effect is least pronounced with the FAST-ER detector. Therefore, 
with FAST-ER, the corner density does not need to be as carefully chosen as with the other 
detectors. This fall-off is particularly strong in the Harris and Shi-Tomasi detectors. Shi and 
Tomasi derive their result for better feature detection on the assumption that the deformation of 
the features is affine. Their detector performs slightly better overall, and especially in the cases 
where the deformations are largely affine. For instance, in the bas-relief dataset (Figure 10C), 
this assumption does not hold, and interestingly, the Harris detector outperforms the Shi and Tomasi 
detector in this case. Both of these detectors tend to outperform all others on repeatability for 
very low corner densities (less than 100 corners per frame). 

Fig. 9 

Repeatability results for the bas-relief data set (at 500 features per frame) as the amount of Gaussian 
noise added to the images is varied. See Figure 10 for the key. 

The Harris-Laplace detector was originally evaluated using planar scenes [60], [102]. 
The results show that Harris-Laplace points outperform both DoG points and Harris points in 
repeatability. For the box dataset, our results verify that this is correct for up to about 1000 
points per frame (typical numbers, probably commonly used); the results are somewhat less 
convincing in the other datasets, where points undergo non-projective changes. 

In the sample implementation of SIFT [103], approximately 1000 points are generated on the 
images from the test sets. We concur that this is a good choice for the number of features since 
this appears to be roughly where the repeatability curve for DoG features starts to flatten off. 

Smith and Brady [76] claim that the SUSAN corner detector performs well in the presence 
of noise since it does not compute image derivatives and hence does not amplify noise. We 
support this claim. Although the noise results show that the performance drops quite rapidly 
with increasing noise to start with, it soon levels off and outperforms all but the DoG detector. 
The DoG detector is remarkably robust to the presence of noise. Convolution is linear, so 
the computation of DoG is equivalent to convolution with a DoG kernel. Since this kernel 
is symmetric, the convolution is equivalent to matched filtering for objects with that shape. The 
robustness is achieved because matched filtering is optimal in the presence of additive Gaussian 
noise [104]. 

Fig. 10 

A, B, C: Repeatability results for the repeatability dataset as the number of features per frame is 
varied (panel B: Maze dataset). D: Key for this figure, Figure 11 and Figure 9. For FAST and SUSAN, 
the number of features cannot be chosen arbitrarily; the closest approximation to 500 features in each 
frame is used. 

TABLE IV 

Timing results for a selection of feature detectors run on frames of two video sequences. The 
percentage of the processing budget for 640 x 480 video is given for comparison. Note that since PAL, 
NTSC, DV and 30Hz VGA (common for web-cams) video have approximately the same pixel rate, the 
percentages are widely applicable. The feature density is equivalent to approximately 500 features per 
640 x 480 frame. The results shown include the time taken for non-maximal suppression. 



B. Speed 

Timing tests were performed on a 3.0GHz Pentium 4-D which is representative of a modern 
desktop computer. The timing tests are performed on two datasets: the test set and the training 
set. The training set consists of 101 monochrome fields from a high definition video source with a 
resolution of 992 x 668 pixels. This video source is used to train the high speed FAST detectors 
and for profile-guided optimizations for all the detectors. The test set consists of 4968 frames 
of monochrome 352 x 288 (quarter-PAL) video. 

The learned FAST-ER, FAST-9 and FAST-12 detectors have been compared to the original 
FAST-12 detector, to our implementation of the Harris and DoG (the detector used by SIFT) 
and to the reference implementation of SUSAN [101]. The FAST-9, Harris and DoG detectors 
use the SSE-2 vectorizing instructions to speed up the processing. The learned FAST-12 does 
not, since using SSE-2 does not yield a speed increase. 

As can be seen in Table IV, FAST in general is much faster than the other tested feature 
detectors, and the learned FAST is roughly twice as fast as the handwritten version. In addition, 
it is also able to generate an efficient detector for FAST-9, which is the most reliable of the 
FAST-n detectors. Furthermore, it is able to generate a very efficient detector for FAST-ER. 
Despite the increased complexity of this detector, it is still much faster than all but FAST-n. 
On modern hardware, FAST and FAST-ER consume only a fraction of the time available during 
video processing, and on low power hardware, it is the only one of the detectors tested which 
is capable of video rate processing at all. 

VII. Conclusions 

In this paper, we have presented the FAST family of detectors. Using machine learning we 
turned the simple and very repeatable segment test heuristic into the FAST-9 detector which has 
unmatched processing speed. Despite the design for speed, the resulting detector has excellent 
repeatability. By generalizing the detector and removing preconceived ideas about how a corner 
should appear, we were able to optimize a detector directly to improve its repeatability, creating 
the FAST-ER detector. While still being very efficient, FAST-ER has dramatic improvements in 
repeatability over FAST-9 (especially in noisy images). The result is a detector which is not only 
computationally efficient, but has better repeatability results and is more consistent under variation 
in corner density than any other tested detector. 

These results raise an interesting point about corner detection techniques: too much reliance 
on intuition can be misleading. Here, rather than concentrating on how the algorithm should do 
its job, we focus our attention on what performance measure we want to optimize and this yields 
very good results. The result is a detector which compares favourably to existing detectors. 

experiment freely available. The generated FAST-n detectors, the datasets for measuring 
repeatability, the FAST-ER learning code and the resulting trees are available from 
http://mi.eng.cam.ac.uk/~er258/work/fast.html. FAST-n detectors are also available in libCVD 
from http://savannah.nongnu.org/projects/libcvd. 

Fig. 11 

A-G: Repeatability results for the 'Oxford' dataset as the number of features per frame is varied. See 
Figure 10 for the key. 